GoAT: File Geolocation via Anchor Timestamping 


ABSTRACT 


Blockchain systems are rapidly gaining traction. Decentralized stor- 
age systems like Filecoin are a crucial component of this ecosystem 


that aim to provide robust file storage through a Proof of Replication 


(PoRep) or its variants. However, a PoRep actually offers limited 


We introduce a new primitive, Proof of Geo-Retrievability or in 
short PoGeoRet, that enables proving that a file is located within 
a strict geographic boundary. Using PoGeoRet, one can trivially 
construct a PoRep by proving that a file is in several distinct geo- 
graphic regions. We define what it means for a PoGeoRet scheme to 
be complete and sound, in the process making important extensions 
to prior formalism. 

We propose GoAT, a practical PoGeoRet scheme to prove file 
geolocation. Unlike previous geolocation systems that make strong 
assumptions about storage providers and require dedicated anchors, 
GoAT makes minimal assumptions and uses existing timestamp- 
ing servers on the internet as geolocation anchors. GoAT inter- 
nally uses a communication-efficient Proof-of-Retrievability (PoRet) 
scheme in a novel way to achieve constant-size PoRet-component 
in its proofs. 


We validate GoAT’s practicality by conducting an initial mea- 
surement study to find usable anchors and perform a real-world 
experiment. The results show that a significant fraction of the in- 
ternet can be used as anchors and that GoAT achieves geolocation 
radii as low as 500km. 


1 INTRODUCTION 


Decentralized systems are a rapidly expanding form of computing 
infrastructure. Blockchain systems in particular have enjoyed con- 
siderable recent popularity and constitute a $2 trillion market at the 
time of writing [3]. Many decentralized applications, ranging from 
non-fungible tokens (NFT) [22] to retention of blockchain state [5], 
require a reliable bulk storage medium. As blockchains have lim- 
ited innate storage capacity, there is thus a growing demand for 
purpose-built decentralized storage systems, of which a number 
have arisen, such as IPFS [15], Filecoin [33], Sia [45], Storj [34], etc. 

Like today’s cloud storage services (e.g., Amazon S3 [11]), decen- 
tralized storage systems typically achieve robustness by replicating 
files. With this approach, even if some replicas become unavailable, 


others can be used to fetch files. To help ensure trustworthy storage 
of replicas, decentralized file systems require storage providers to 
prove retention of file replicas. Most notably, Filecoin [17] uses a 
protocol called Proof of Replication (PoRep) [24] for this purpose, 


while related systems such as Sia, Storj, etc., use similar techniques.! 
While a PoRep or related proof system can prove the existence 
of multiple copies of a file, however, its robustness assurances are 


limited. This is because a PoRep does not ensure that file replicas 


lFilecoin has the most flexible yet most computationally expensive approach 
among these systems: Its PoRep proof system works for plaintext files, while other 
decentralized storage systems only work assuming distinct ciphertext file replicas. 
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Figure 1: A prover and a Roughtime anchor are situated 300km 
apart. GoAT’s region of uncertainty is a circle of radius 2000km 
(purple) proving that the replica is in the east half of North America. 


reside on independent devices or systems. If all file replicas are stored 


on the same hard disk, for example, damage to that one device can 
destroy the file. 

In this paper, we explore an alternative approach to building 
PoReps: proving that file replicas reside in distinct geographical re- 
gions. For example, one may wish to prove that three replicas of a 
file are present in the United States, Europe, and Asia respectively. 
Such proof automatically implies the property ensured by a PoRep, 
namely the existence of three distinct replicas of the file. It also 
ensures much stronger properties than a proof of replication alone, 
namely that file replicas are stored on distinct devices and in distinct 
physical locations. These additional properties imply that the file 
can survive device failures, destructive local events (e.g., natural 
catastrophes), etc. Thus the ability to prove replica geolocation 
can greatly improve robustness in decentralized storage systems. 
Geolocation-based proofs can also incur substantially lower resource 
costs than techniques like PoReps, as we show in this paper. 

Beyond PoReps, proving storage location is useful in other set- 
tings. For example it can help prove compliance with laws specify- 
ing localized storage of certain forms of data, e.g., [28].? It can also 
be used by CDN providers to prove that they are serving data from 
geographically distributed locations according to a claimed policy. 

The goal of our work is, specifically, to build protocols to prove 
that a given file replica is stored within a strictly-bounded geograph- 
ical region. Our main building block for these protocols is a primi- 
tive we call a Proof of Geo-Retrievability (PoGeoRet). A PoGeoRet in- 
volves a single prover proving to a number of verifiers that it holds a 
file replica in a given geographical region. To ensure the practicality 
of our PoGeoRet designs we consider here, we focus on proofs in- 
volving relatively large geographical regions (e.g., thousand-mile di- 
ameter), which suffices for key applications such as file replication. 


We introduce a formal definition of PoGeoRets in this paper, 
and propose, implement, and experimentally validate a PoGeoRet 


2Some nations only require that a copy of data be stored locally whereas more stricter 
laws make transferring data abroad illegal [28]. Our techniques suffice for the former 
but the latter would additionally require the use of trusted hardware. 


system called GoAT. GoAT creates publicly verifiable file-replica 
geolocation proofs. GoAT proofs can thus be consumed by a mul- 
tiplicity of verifiers and can be used to construct a system that 
ensures the presence of file replicas in desired locations even in the 
presence of some dishonest verifiers. 

Previous works have explored internet-resource geolocation— 
both servers [32, 46] and files [18, 47]—but operate in benign cloud 
settings, e.g., assume cloud providers have normal network con- 
nectivity (even between two datacenters [18]), verifiers are close to 
providers, and/or files are stored in cloud systems whose locations 
are known a priori, etc. These assumptions make such approaches 
unsuitable for decentralized settings of the type we explore here. 
GoAT requires none of these assumptions. 


1.1 The Anchor Model 


To avoid the undesirable assumptions of previous geolocation sys- 
tems, we explore a model for GoAT that relies on a collection of 
servers called anchors. 

An anchor is a server with a publicly announced location that 
emits digitally signed timestamps on queries. That is, an anchor has 
an API that returns the current time along with a signature over 
the time and any value sent by a client. 

Anchors used in GoAT need not be in close proximity to storage 
service providers. Additionally, the main job of an anchor is not 
to geolocate entities directly, but only to provide timestamps. It is 
possible to handle a local minority of misbehaving anchors. 

Anchors can be purpose-built for a GoAT instance. We also show, 
however, that it is possible to use existing, unmodified servers, e.g., 
TLS 1.2 or Roughtime [9] servers, as anchors. Thus it is possible to 
realize GoAT with today’s internet infrastructure. 


1.2 Proving Geolocation 


In GoAT, a prover must prove proximity to an anchor. The starting 
point for GoAT is a simple, well known technique: The prover pings 
the anchor successively to get two timestamps 1, t2. If the prover 
is indeed situated close to the anchor, then the timestamps will not 
differ by much, i.e., t2 — tı < A for some small A> Identification 
of the prover is done by signing the first anchor response with 
the prover’s private key and using the signature as a nonce in the 
second ping. This form of chaining is also crucial to ensure that the 
two anchor pings are indeed made successively. 

The two anchor responses together form a proof that the prover 
is situated close to the anchor. The signature on these responses 
makes the proof publicly verifiable. Assuming that the anchor lo- 
cation is known, it becomes a proof of location for the prover. 

GoAT requires that a majority of anchors situated near a given 
location be honest. Intuitively, this localization is necessary since 
each anchor is only useful in places close to its location. 


Realizing proofs of geolocation: Several challenges arise in ba- 
sic proofs of geolocation. Existing anchors pose one key challenge. 
TLS 1.2 servers, for instance, only provide second-level timestamps, 
which is insufficient as network round-trip times are on the order 
of milliseconds. We address this challenge by introducing a tech- 
nique for amplification: Instead of pinging twice, a prover pings the 


3This is not always true due to network abnormalities. GoAT accounts for them with 
a conservative network model introduced later. 


anchor repeatedly with interrelated challenges over an extended 
time interval, e.g., a full second. Another challenge is identifying 
usable anchors. Many TLS 1.2 servers, for instance, do not return 
accurate time or have a unique location, needed for a prover to 
prove geolocation. We conduct an initial measurement study of the 
Alexa top 1M list to identify a broad network of usable anchors. 
Another important practical concern is handling network volatil- 
ity. We provide an empirical framework to calibrate the time thresh- 
old A for the prover to assert proximity to an anchor. The frame- 
work depends on factors like expected network quality and anchor 
characteristics, such as how quickly a given anchor responds. Our 
approach helps minimize false rejection rates, particularly given 
that in our protocols a brief period of good network connectivity 
(say few seconds) amidst a longer period of time (say hours) suffices 
for an honest prover to prove file replica possession successfully. 


1.3 Geolocating Files: GoAT 


To build on basic geolocation proofs and realize GoAT, our strategy 
is to interleave into the prover’s anchor pings a Proof of Retrievabil- 
ity (PoRet) [31, 40]. A PoRet proves storage of a full file replica. In 
isolation, though, it proves nothing about a file’s storage location. 
Thus the need to integrate it with a geolocation scheme. 


Making GoAT efficient: A key challenge is reducing GoAT’s com- 
munication complexity. Due to a combination of amplification, 
proof accumulation over several epochs and different anchors, the 
proof sizes quickly blow up, even with use of the communication- 
efficient Shacham-Waters (SW) PoRet [40]. Through incorporation 
of vector commitments and compression across proof instances, we 
manage to compress the size of PoRet-related proofs even across 
a sequence of pings to just a few bytes. 

Yet another challenge in realizing GoAT is minimizing the time 


taken for the operation between the two anchor pings. Slow operation— 


as caused by computing a full PoRet proof—degrades geolocation 

accuracy. We therefore introduce techniques to compute a fast 

PoRet commitment to the randomness in a SW proof between the 

two pings, without actually computing the proof. As we will show 

later, the introduction of PoRet commitments is crucial for geolo- 

cation, improving accuracy by as much as 20x in some cases. 
Figure 1 illustrates GoAT’s file geolocation capabilities. 


Defining PoGeoRet: A theoretical contribution of our work is 
in defining what it means for a PoGeoRet scheme to be secure. 
The formalization for PoGeoRet soundness is similar in spirit to 
that for PoRet but leads to interesting new subtleties. Intuitively, 
a PoGeoRet is sound if acceptance by a verifier means that a file 
F can be extracted from the prover. The key difference for a PoGe- 
oRet is that successful extraction must now be possible from the 
target location. To capture the notion of file location, we introduce 
a location-specific commitment oracle. This oracle models the PoRet 
commitment function and tracks queries made to it from within the 
target region. We say that a PoGeoRet is sound if the file fragments 
seen by the commitment oracle are enough to recompute the file. 
The choice of PoRet commitment function leads to two variants 
of GoAT with a performance, security assumption tradeoff between 
them. GoAT-H uses a hash function to commit and admits a secu- 
rity proof under common assumptions. GoAT-P uses homomorphic 


commitments to achieve at least 3x smaller proof sizes but relies 
on a new knowledge assumption closely related to KEA1 [14]. 


Contributions: Our contributions are summarized as follows: 

(1) New Security Definitions and Modeling: We define what it means 
for a PoGeoRet scheme to be complete and sound, the latter 
requiring important extensions to the classic PoRet security 
experiments (Sec. 3). We also introduce practical model variants 
for PoRet and PoGeoRets of potential independent interest that 
facilitate bootstrapping using existing servers and fast encoding. 

(2) GoAT: We introduce our Proof of Geo-Retrievability (PoGeoRet) 
protocol GoAT in Sec. 4. GoAT leverages the Shacham-Waters 
PoRet and timestamping anchors. We explore optimizations to 
reduce the size of GoAT proofs, achieving constant-size PoRet- 
component in our proofs. We prove GoAT security in App. D. 

(3) Implementation and Evaluation: To demonstrate GoAT’s prac- 
ticality we prototype GoAT-P and run a small real-world ex- 
periment using 10 TLS / Roughtime anchors (5 each in the US 
and UK) for over a week. GoAT’s prove and verify protocols 
execute in just a few seconds, with proof sizes of a few hundred 
KB. We show geolocation radii lower than 1000km, even tighter 
than required for applications like file replication (Sec. 5). 

Sec. 6 contains related work. We have also released GoAT as an 
open-source tool. 


2 PRELIMINARIES 
2.1 Authenticated time protocols 


We are interested in time protocols that are authenticated, i.e., the 


timestamp must be digitally signed. Two main options exist today. 


2.1.1 TLS 1.2. Some TLS 1.2 servers [23] embed the current time 
in seconds into the first 8 bytes of the “server random” value. This 
value is then signed and sent to the client as part of TLS 1.2 key ex- 


certificate. This trick works for Diffie-Hellman based key-exchange, 
including elliptic-curve variants, and for RSA as well. 

This functionality has always been an informal practice, and is 
‘not specified in the TLS 1.2 RFC, but is widely practiced—we found 
about 1/5 of top 500 hosts in Alexa list supported this technique. 
Finally, this method does not work with TLS 1.3 as the specification 
‘specifically deprecates it. In practice though, TLS 1.3 adoption is 


only growing slowly. And TLS 1.2 is expected to be supported by 
most websites in the near future, e.g., in April 2021, 99.4% of Alexa 
top 1M sites [1] were found to support TLS 1.2 [38]. 


2.1.2 ‘aun Roughtime [9] is a recently developed authen- 


ticated time protocol. At the time of writing, we are aware of four 
providers hosting Roughtime—Cloudflare [37], Google, Chainpoint 
and int08h. Roughtime servers 
in ps signed with a fast signature scheme (EdDSA). As the name 
“Roughtime” suggests, the protocol is only designed to provide a 
roughly accurate time, say 


say NTP. Note that GoAT does not need accurate absolute time. 


2.2 Proof of Retrievability 


Proof of Retrievability [31] schemes enable a prover to prove knowl- 
edge of a complete file replica in a communication-efficient manner. 


T «Ne: specify our 
construction of PoRet.Commit later. The PoRet.Commit function 


For GoAT, we require a publicly verifiable PoRet scheme. Merkle- 
tree (MT) based variants [31] and Shacham-Waters (SW) [40] are 
the two main choices. 

Figure 10 shows the API for a PoRet scheme. The file owner 


In a typical PoRet system, pp is posted 

publicly so that any party can verify a proof of retrievability. 
A special feature of GoAT is the introduction of an additional 
functionality in a PoRet. This functionality, called PoRet.Commit, 
commits to randomness for use in a (future) PoRet proof. We in- 
troduce PoRet.Commit to enable fast prover interaction with a 
and thus require that it be: (1) quickly com- 


is the only addition we make to the PoRet scheme used in GoAT, 
which otherwise remains unmodified. 


3 FORMALIZING PROOFS 
OF GEOGRAPHIC RETRIEVABILITY 


A Proof-of-Geographic-Retrievability (Po et) scheme includes 
three parties*: 


Desired properties: Like any security protocol, a PoGeoRet must 


satisfy two basic properties: completeness and soundness. Complete- 
ness means that the PoGeoRet scheme must succeed for any honest 


Soundness means that 


Section structure: We start with preliminaries in Sec. 3.1, explain- 
ing how a PoGeoRet leverages an underlying PoRet. We provide 
the adversarial model in Sec. 3.2, and then present the basic mod- 
eling behind our formal definitions. We formalize completeness 
in Sec. 3.3 and soundness in Sec. 3.4. Finally, in Sec. 3.5, we discuss 
modifications to our security model and definitions that we believe 
reflect requirements in real-world use cases, such as support for 
fast file encoding and easy bootstrapping. 


3.1 Preliminaries 


Protocol structure: The API for a PoGeoRet scheme is in Fig. 2. 

We assume in this API and throughout this section that a PoGe- 
oRet scheme internally leverages a PoRet scheme. In what follows, 
where clear from context, we drop PoGeoRet from our notation, 
e.g., use Setup to denote PoGeoRet.Setup. 

We define a PoGeoRet for a general setting in which a target file 
F is stored as a publicly accessible plaintext. 

A user U that wants a file F to be stored near a particular loca- 
tion runs the setup protocol (PoGeoRet.Setup) on F to generate an 


4Of course, in practice, a decentralized system will typically include many instances 
of each party type. 


Proof of Geo-Retrievability 
© (sk,pk) — KGen(1*): Generate key pair. Run by the user. 
e (F*,n,pp) < Setup(sk,pk,F): Runs setup of the underlying PoRet scheme to 
generate F*, which contains the file plus the generated data, its handle y, and 
some public parameters pp. Run by the user. 
c e Chal(7, pp): On input file handle 7 and params pp, derive a random 
challenge c. Run by the verifier. 
e 78° — Prove(y,R,c): On input file handle y, a geographic region R and a 
challenge c, generate a proof of geo-retrievability 78°°. Run by the prover. 
- Commit(p): A sub-function of Prove that operates on a file fragment p. 
0/1 — Verify (pp,R,c,8°°): The verifier checks that the file is in the desired 
region R by verifying the proof z®°° using the challenge, public params. 
F — Extract(4,R,pp): The extraction algorithm consists of two sub-functions: 
- pel! < Extr.Derive(7,R,pp): An interactive protocol run with the prover. It 
takes as input file handle, geographic region, public parameters and outputs 
a list of file fragments p". 


- Fe Extr.Assemble(j2"): Assemble file fragments to compute the file. 


Figure 2: Proof of Geo-Retrievability (PoGeoRet) API. 


encoded file F*. U then gives F* to a storage provider P situated 
near the desired location. The public parameters pp are published, 
e.g., on a blockchain. 


A PoGeoRet protocol runs in epochs. During each epoch, the 
provider P computes a Proof of Geo-Retrievability using the Prove 
protocol. An auditor V can use the public parameters pp to verify 
the generated proof via the Verify protocol. 

The key ingredient in Prove enabling a file geolocation proof is 
the sub-function Commit. It takes a file fragment p as input and 
outputs a commitment of it. 

A PoGeoRet must also specify an extraction algorithm Extract 
that can recompute the file F from the prover’s responses. Extract 
will be used to model extraction in the soundness definition of a 
PoGeoRet in a way largely similar to prior works [31, 40]. A key 
difference from prior works is that Extract needs to follows a spe- 
cific design; it must be composed of two algorithms: Extr.Derive 
interacts with the prover and outputs a list of file fragments ie 
and Extr.Assemble recomputes the file F from the fragments. We 
assume, as in, e.g., [31, 40], that during extraction, the prover can 
be rewound. 


Modeling geolocation: We model geolocation in a PoGeoRet us- 
ing a metric space [7] 


ple, 


points on a sphere, regions correspond to circles on the surface. For 
simplicity, we will only consider such circular regions. We use the 
notation R.L / R.6 to refer to the center / radius of R respectively. 
Suppose that we want the PoGeoRet scheme to facilitate storage 
of files in a target region R'@'8*t = (L;5) where 6 is a small radius 
that captures the breadth of the target region. Our definition allows 
for any arbitrary ô; in practical settings however, geolocation will 
be most beneficial for a small target region, e.g., the size of a city. 
Any storage provider located inside R'#"8*t can then join the system. 


Region of uncertainty: We define a Region Of Uncertainty (ROU) 
the permitted noise in the attained geolo- 


denoted R'" to capture 


cation guarantee. The PoGeoRet scheme then must ensure that files 


>The metric is a function that defines the concept of a distance between any two set 
members, and satisfying a few simple properties such as the triangle inequality. 


For a location L € M, we define a region R=(L;5) as the set of 
all L’ € M that satisfy dist(L,L’) < ô. For example, when M models 


Notation Description 
U User / File owner 
P Storage provider / Prover 
V Auditor / Verifier 
AT Anchors (single / set) 


Table 1: System entities. Anchors are specific to GoAT. 


are stored inside the region R". In other words, an ROU helps 
eliminate spurious proof failures. 

Continuing the previous example of earth surface as M, say we 
want to support file storage in New York City. Then the target region 
is Rtarset — (L;5) where, e.g., L= (40.73°,-73.93°) and 6=10km.° 
Suppose that we are willing to tolerate noise in proofs up to the 
point where we ensure that files are at most 1000km from New 
York. The desired region of uncertainty then is R" =(L;1000km). 

Our definitions are given with respect to a single target region 
Rtarget_Tn practice, it might be desirable to support several distant 
locations. We expect our definition to be applied to each desired 
target region independently. 

Where convenient, we refer to a region of uncertainty R'°" as 
RÌ” and define its complement by R°“t = M\ R". 


Storage devices: To allow an adversary to place files in several 
distinct locations, we introduce a model of (storage) devices. We 
denote a device by D. In our security experiments (for soundness), 
all devices are under the control of the adversary / prover. The 


prover can place devices in locations of its choice but those loca- 
tions remain fixed throughout the experiment. Devices have access 
to unlimited storage memory. Formally, we model all devices by 


way of an oracle Ogey presented in Sec. 3.4. 


Modeling time: As noted before, all PoGeoRet schemes must use 
time to distinguish between a challenge answered with a local file 
versus another answered with a file fetched from afar. To do so, each 
must have a specified expected time. We allow the adversary to 


communicate messages (of any size) between devices with speed 
Smax. In our security experiments below, we assume that the verifier 
internally keeps track of time whenever necessary. 


3.2 Adversarial Model 


Table 1 lists the entities in a PoGeoRet scheme. The adversary 
A controls the storage provider P. We assume that the auditor V 
and the user U are honest. (In a decentralized system, it is easy to 
imagine an honest-majority assumption for auditors.) 


3.3 Completeness 

Completeness requires that for all key pairs (sk, pk) output by 
KGen, for all files F € {0,1}*, and for all {F*,7,pp} output by 
Setup(sk,pk,F), the verification algorithm accepts when interacting 
with a valid prover P situated on a device D inside Riarget ie. for 
all challenges c<—Chal(n,pp), we have 


Pr lı < Verify (pp,R" ,c,7r8€°)|7r8°° — Prove(1,R'",c) l >1-negl(A). 


Device oracle Ogey 


State: A region R'. Key-value pairs D [did] =(loc,mem) where the key did 
1: is the device identifier, loc is its location, mem is a list denoting the memory. 
A list p"° to track inputs to the commitment oracle OÙ? p- 
2: init(R): Set R" =R. Not callable by the adversary. 
3: createDevice(did,loc,mem): If did € D return L. Set D [did] =(loc,mem). 


exec(did,func) — out: If did ¢ D return L. Compute func and return its out- 
4: put. func can read / write to D[did].mem or call any of Ogey functions inter- 
nally. If OÏ m is called with input p and D [did].loc € R", do p"°°.append (2). 
sendTo(did;,did2,data): If data € D[did,].mem, 
D{did2].mem.append{(data). 
6: erase(did,j): Erase index j, D[did].mem.erase(j). 


all 


7: seenInROU (p?"): Return 1 if Vy € yu", € "°° holds. 


Figure 3: The device API. 


3.4 Soundness 


Our security definition for soundness involves two security ex- 
periments: a setup experiment and a challenge experiment. The 


e-fraction of queries correctly. The challenge experiment interface 
is reused for geo-extraction, where we try to extract F using the 
Extract protocol. Geo-extraction is deemed successful only if F can 
HE 


associated with target region R'". The definition says that the Po- 


Corresponding to the setup and challenge experiments, the ad- 
versary A consists of two parts, Asetup and Achal, each involved 
only in its respective experiment. 

The adversary Asetup may interact arbitrarily with the verifier; 
it may create files and cause the verifier to run setup on them; it may 
also undertake challenge-response interactions with the verifier 


and observe if the verifier accepts or oOo ooo 
to 


The purpose of the setup experiment is to run Setup on a file F 
picked by the adversary. The resulting output F* is challenged in 
the challenge experiment. 

During the challenge experiment, challenges are issued to the 
second adversary component Achal and success is based on whether 
the proof verifies. 

During geo-extraction, the same adversary component Achal 
is reused. Geo-extraction has three steps in total. First Extr.Derive 
derives file fragments p°! We allow 
the adversary to be rewound in this step, as is standard in the PoRet 
literature, e.g., [31, 40]. Second Extr.Assemble tries to recompute 
the file from the derived file fragments 4?" Extr.Assemble does not 


interact with the adversary. The third and final step is verifying if 
all the fragments 2"! were seen inside R'". Geo-extraction succeeds 


only if step 2 and 3 both succeed. Finally a PoGeoRet scheme is 


®Tn practice, 5 would have to be decided based on the city geography. 


Exp P (R) Expshal (7,R,pp,s) 
Osey.init(R) AC (s) % Init Achal 


(sk,pk) — KGen(1*) 


c Chal(ņ,pp) % Random chal 


0 ev 
Fe gÊdev (pk) 8° — Aa -Prove(7,R,c) 


setup 


(F*,n,pp) — Setup(sk,pk,F) return Overity (pp-Rc 7°?) 
O, k 
s Asetup (F*.9,Pp) 


setup 


return (7,pp,F,s) 
Figure 4: Setup and Challenge Experiments. 


said to be sound if adversarial success in the challenge experiment 
implies that geo-extraction succeeds w.h.p. 


Commit function and 


the oracle is invoked from a device located in R'". So if Oi", was 
queried about a fragment, then it must have been inside R'", and 
consequently if enough file fragments are inside Ri", then the file 
F itself is in R". 

From the point of view of an adversary whose goal is to “cheat” 
a verifier, A wants to create an environment in which V believes 
the file is in R”, but it isn’t. Thus the aim of Asetup is to set up 


devices in such a way that: in 
F 


from the file fragments input to Of}. 


We present our detailed security experiments in Sec. 3.4.1. They 
come together in our soundness definition in Sec. 3.4.2. 


3.4.1 Soundness security experiments. We model device actions 
in our security experiments via the device oracle Ogey specified 
in Fig. 3. 


Ozey-init is used to store the target region of uncertainty 


tion. 
including any function in the PoGe- 


oRet API. Within the API, PoGeoRet.Commit is explicitly mod- 
eled using the commitment oracle 


om inside RT“. 


Note that exec can also communicate with other devices through 
Odey-SendTo (or) execute code on a different device. In all our exper- 
iments, the adversary is given complete freedom to call any device 
function. Both the experiments are in Fig. 4. 


Setup experiment Ep "P; In our first experiment, Setup is run 
over a file F and the output given to the adversary, who decides 
where to place the file. Asetup outputs state s that is given to Achal: 


Challenge experiment Exp‘*!: In our second experiment, Achal 
responds to a Commit challenge issued by the verifier. The adver- 
sary is deemed successful if it generates a response that succeeds 
with a probability at least e. Note that we issue one PoGeoRet chal- 
lenge which internally comprises of one PoRet challenge. Success 
probability for the challenge experiment is defined as: 


Succ? (y,R,pp,s) =Pr [Exp (n.Rpp.s) =1 l . 


3.4.2 Soundness definition. Our main security definition is: 


DEFINITION 1 (SOUNDNESS). A PoGeoRet scheme is (€,p)-sound 
w.r.t a target region Riarget achieving a region of uncertainty R', if 
for all poly-time A: 

p?! — Extr.Derive.q(7,R" pp), 
Pr Extr.Assemble(@!') =F, 


Odgey.seenInROU (p?!) =1 


(npp.F.s) —Exp'y (R), 
Succ? (y,R"®" pps) Se 


It states that a PoGeoRet scheme is (e,p)-sound if, for every ad- 
versary that succeeds the challenge experiment with e probability, 
geo-extraction must also succeed with p probability. Sometimes 
we omit p and say that a PoGeoRet scheme is €-sound; in such 
cases we mean that p is negligibly close to 1, i.e., p=1—negl(A). 
Note that the extraction algorithm Extr.Derive interacts with Achal 
initialized with the state s output by the setup experiment. 


The above definitions of soundness and completeness assume 
an interactive protocol between the prover and verifier. In practice, 
non-interactive schemes are often desirable. We aim to build such a 
scheme in GoAT. Due to lack of space, we provide non-interactive 
definitions in App. B.1 (they only require minor modifications). 


3.5 Practical model variants 
There are two modeling assumptions which, by assuming an eco- 


nomically rational adversary, can lead to significantly better per- 
formance, and which we therefore embrace in GoAT. The first 


assumption—a lower bound on bandwidth costs—dictates when 


lows use of internet infrastructure for effective bootstrapping. The 
second assumption—rational behavior by an adversary in file-block 


These two models are of independent interest beyond GoAT; for 
example, they could be applied to a PoRet scheme. 


3.5.1 Challenge regimes. In the previously described model, the 
We refer to this chal- 
lenge pattern as the random-challenge model. Building a practical 


PoGeoRet scheme under this model requires an existing network 
of verifiers, thereby posing a bootstrapping problem. 


we believe, is one, 


internet infrastructure. In GoAT, we derive challenges from sig- 
natures provided by internet servers. But since the verifier is not 


issuing challenges, the prover decides when the challenge-response 
interaction is going to take place. We call this model the flexible- 
challenge model, signifying the extra flexibility prover has. 


At first sight, such a model might not seem to work, as the 


response interaction. But we argue that by imposing a restriction 


We now discuss the operational and security models for flexible- 
challenge model and end with an example. 


Operational model: Time runs in epochs. Each epoch is in turn com- 


posed of I intervals, each of length £. Interval length determines 
challenge frequency, i.e., challenges are issued once per interval. 
Epoch length determines verification frequency, i.e., challenge re- 


sponses are accumulated over the I intervals are generated and 
verified only at the end of an epoch. 


IV 
$ 


Security model: Like before, we begin with a setup experiment 
where the adversary picks a file F and initializes several devices. 
But then, J challenge experiments take place, one per interval. After 
the epoch (or) J intervals end, the challenge responses are verified. 
Geo-extraction takes place after that. 

The device oracle Ogey now maintains a record of the commit- 


ment oracle queries made in each interval; let {°° denote the list 
of fragments queried in the ith interval. In each interval, the ad- 


versary requests a challenge at a time of its choosing. After the 


epoch (or I intervals) ends, we extract the file I times by running 
Extract. Geo-extraction in the ith interval succeeds if the file can 
eera = Soundness is defined in 


the same way as before except that 


The model includes a bandwidth constraint: only ¢ bytes can 
be transferred from devices in R“* to those in Ri" ($ < |F|) per 
interval. The bound ¢ is meant to reflect the economics of storage: 


orders of magnitude more than that of storage, as shown below. 


Example: For the purpose of this example, we compare the band- 
width and storage costs charged by Amazon (the storage cost is 
used as a proxy for revenue). Let’s say we set the interval length 
B=30mins. If the encoded file size is |F*|=1TB, then the storage 
revenue is at most $0.02 per interval on Amazon S3 [11]. On the 
other hand, AWS bandwidth costs start from $20 per TB.’ So down- 
loading 1GB would cost the same as the revenue obtained by storing 
1TB, and therefore ¢=1GB; more broadly the relation between the 
|F*| 


bandwidth cap and the encoded file size is given by $= qq. 


3.5.2 Rational file retention and erasure-coding. Recall that a 
PoRet encoding F* of file F incorporates an erasure code (to amplify 
soundness). To get strong soundness based on the security defini- 
tions we have given, it is essential to use a code with high distance 
between codewords—e.g., a maximum distance separable (MDS) 
code such as Reed-Solomon—treating F as a codeword. Such erasure 
coding is robust to adversarial erasures. MDS coding, however, is 
expensive in practice for large codewords, asymptotically at best 


O(nlogn) for file size n (given tolerance of a constant-fraction of 


erasures) [35, 41], and very costly in practice. (To avoid this problem, 


a number of PoRet protocols, e.g., [20, 31], have 


A second, more practical approach, we believe, is to use weaker 
erasure codes, specifically 


e.g., [42]. Such codes are far more performant for large files than 


MDS coding. They are designed, however, for noisy channels with 
random erasures, and are brittle in the face of adversarial erasures. 


Thus they provide poor security against a malicious adversary. 


however, it is possible to achieve 


good security with linear-time coding. Such an adversary may be 


viewed as seeking to maximize its financial gain and minimize its 
expenditure on storage. All other things being equal, however—for 
instance, given a certain amount of allocated storage in a given 


7 AWS bandwidth costs vary by region, ranging from $20-$100 per TB transferred [10]. 
S3 charges also vary by region, we use the maximum above. 


geolocation—such an adversary will attempt to preserve F. We 
model this formally through a greedy storage algorithm that decides 
where to place bits of a given file in a greedy way with preference 
given to devices in Rİ". The adversary still retains control of storage 
capacities of the devices, thus effectively deciding the amount of 
storage space inside R'. More details are in App. B.2. 

Assuming rationality in a provider reflects natural ecosystem 


design decisions. For example, a provider may be paid for retrieving. 
F, or may earn a reputation for reliable service. Such a provider 
has a financial incentive to ensure that a stored file F is recoverable. 
The provider will not strategically erase file blocks in an attempt to 
render F unrecoverable if it has to store the same amount of data 
anyways. Consequently, it is possible to achieve strong soundness 
using a linear-time erasure code. 


4 THE GOAT PROTOCOL 


We now present details of our PoGeoRet scheme, GoAT. We be- 
gin by discussing GoAT-specific modeling details (Sec. 4.1). Next 
we provide a brief description of the Shacham-Waters (SW) PoRet 
scheme (Sec. 4.2). In Sec. 4.3, we present the two variants of GoAT: 
GoAT-H and GoAT-P, that use slightly different variants of the SW 
PoRet scheme. In Sec. 4.4, we discuss the security guarantees GOAT 
provides and introduce a new knowledge assumption. Finally in 
Sec. 4.5, we discuss extensions supporting use of low-resolution 
anchors and ways to decentralize trust among anchors. 


4.1 System Model 


We describe modeling details specific to GOAT now. GoAT achieves 
soundness under the flexible-challenge model. All the important 
notation is tabulated in Table 2. 


Network model: We approximate the Earth to be a sphere. The 
metric space (M.dist) is defined by the set of all locations on earth 
(M) and the spherical distance function (dist). 

We assume that the maximum network speed attainable by an 
ARERR Sax. And the SE 
providers joining our system is Smin: The ratio =Simax/Smin is the 
network speedup of the adversary. These parameter values need to 
be decided based on empirical measurements (See Sec. 5). Honest 
providers only need to attain the speed Smin for a short period of 
time. For example, if interval length S=1hr, good connectivity for 
a few seconds every hour suffices. 

We also include a small startup cost tstart as it dominates the 
round trip times for nearby locations. The expected maximum 
time for a round trip between two locations L; and Lz is given by 
ttmax(L1.L2) = (2dist(L1,L2) /Smin) + tstart- 

Anchors: GoAT leverages existing internet servers called anchors. 
Anchors must provide an authenticated time API and have a static 
known location. The time need not be absolutely correct, relatively 
consistent time is allowed. Clock drift is assumed to be negligible. 


To begin with, anchors are assumed to be honest. Decentralizing 


trust in anchors is discussed in Sec. 4.5. 

Anchors serve time through an API denoted “GetAuthTime”. It 
must take as input a nonce N and return a transcript T= {M,o} 
where M = {t,N} is a message containing the time t and nonce 
(M could also contain other data), and ø is a signature over M, i.e., 
o = Sigsk, (M). The key pair (sk,,pk,) are the secret, public key 


Notation Description 
€ Frac. of queries answered correctly 
p Erasure code rate 
Lp #intervals per epoch, interval length 
R=(L;6) Circular region defined by center L & radius ô 
w Network speedup 
T Anchor transcript 
H File fragment 
n, s, k #blocks, #sectors-per-block, #challenges in SW 
(o) Bandwidth cap 
a Grinding cap (2* GetAuthTime calls) 
a Amplification factor 


Table 2: Notation 


of the anchor respectively. We assume a list of anchors 7 is de- 


imestamp resolution of an anchor 
GoAT sup- 


ports anchors of all resolutions, although smaller resolution leads 
Storage model: We assume storage providers use SSDs for storage 
(we do not support HDDs, see [36]). Modern SSDs are quite fast with 
seek times of just a few milliseconds [8] due to inbuilt parallelism. 


4.2 Shacham-Waters scheme 


At a high level, SW uses BLS-style signatures to facilitate proof 
aggregation and public verifiability. We explain important details 

of the SW protocol [40] now omitting low-level details, which can 

be found in Fig. 9 in the Appendix. 


Let T Á 


s, with each block further divided into s sectors. Each 
: i e 


Each challenge in SW.Chal consists of a block number c; € [n] 
both derived randomly. Denote the num- 


ber of challenges by k. Generating a proof in SW.Prove requires 


n= {uj}; face denotes vectors) and aggregating the cor- 
Crucially, the proof size does 


not depend on the number of challenges k. 

Modifications to SW: We add a function SW.Commit that com- 

putes and commits the value p using a vector commitment (VC) 

scheme. Note that u is computed exactly as in the proof function. 
We consider several variants of SW that differ in the choice of 


VC. The main one SW-P uses the Correlated Pedersen scheme where 


bases. That is, a 


back to the formalism from Sec. 3, the crucial Commit function in 
GoAT is vector commitment and yp acts as the file fragment. 


The verification function (SW.Verify) takes a commitment of 4 
‘matches the commitment. Other functions are unchanged. 


4.3 GoAT protocols 


We now present the GoAT protocol which relies on high-resolution 
anchors, i.e., an anchor A with millisecond or lower timestamp 
resolution (or) T4 < 1ms. A real-world example of such an anchor 
can make use of Roughtime [9], which has a 1ps resolution. Low- 
resolution anchors are considered in Sec. 4.5.1. 

Two variants of GoAT arise depending on the PoRet scheme used. 
The first, GoAT-H, uses SW-H PoRet scheme, and admits a security 
proof in the random oracle model. The second, GoAT-P, uses SW-P 
PoRet scheme and is more performant than GoAT-H with at least 
3x smaller proof sizes, but its security is based on a new assumption. 
The two schemes are largely similar, so we present GoAT-P first 
and discuss modifications needed for GoAT-H subsequently. 


4.3.1 GoAT-P. As usual, our setting involves a user U that wants 
to store a file F with a provider situated in a location Lp € R*"8*t, 
a target region. Since GoAT is a non-interactive protocol, all APIs 
have the preamble NI. As noted before, the non-interactive PoGe- 
oRet API is in the Appendix (Fig. 7). 


Setup (NISetup): U runs the PoRet setup protocol (SW.Setup) over 
F to generate transformed file F*, file handle 4, and the public 
parameters pp. Then U picks a storage provider P located at an 
admissible location Lp and sends {F*,7,pp} to P. 

As noted before, we assume that a set of anchors 7 is predeter- 
mined; let A€ 7 be one such anchor located at L4. For simplicity, in 
this section, we assume anchors are trusted and thus that it suffices 
to use the single anchor A. Other protocol parameters such as the 
interval length f, number of intervals per epoch I are assumed to 
be predetermined. 


Proof generation (N|Prove): Generating a proof of geo-retrievability 


happens in two phases. In the 


phase, the provider interacts with the anchor to obtain PoRet chal- 


. This phase 


run only once per epoch, the provider computes the full PoRets. 
Geo-commitment generation (NIGeoCommit): The key idea is to 


ooo 


Fig. 5 depicts the geo-commit protocol 


explained now: 


(1) Ping #1: Sample a random nonce Nj, and send a request 
GetAuthTime(N7) to A. Receive transcript Tı = {M1,01 } where 
M,= {t N1 } and oj; =Sigsk, (Mı). 

(2) PoRet commitment: Use o1 as randomness to derive a set of chal- 
lenges S — SW.Chal(7,pp,o1). Now generate a commitment 
com +— SW-P.Commit(7,S). (Refer to Sec. 4.2 (or Fig. 9) for a 
description of SW.Chal and SW-P.Commit.) 

(3) Ping #2: Set nonce Nz = com and ping the anchor A again via 
GetAuthTime(N2). Receive T2 = {M2,02 } where M2= {t2,N2 }. 


We refer to the pair C8°°= {Tj,T2} as a geo-commitment. Note 
that the PoRet commitment com is embedded in T2, so we do not 


explicitly mention it. By the end of an epoch (or I intervals), the 
I 
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Figure 5: The geo-commitment protocol. 


PoRet computation (NIPoRCompute): Once an epoch ends, the provider 


A naïve approach is to simply run the SW.Prove function J times 
with the same challenge sets used in step 2 of the geo-commit phase. 
But this leads to larger proofs. (Looking ahead, GoAT-H takes this 
naïve approach.) 


Instead we aggregate proofs in much the same way as SW.Prove, 
except for one key step, 
ts {rj} from the final PoRet commit- 


dorand 
e challenge set used to compute the jt} PoRet 


tment by Sj = leipo ya where j € {1,...,J}. The newly 
generated coefficients are incorporated into those for the challenge 


i * — PE n ee k 
sets, j,S = {cijrjvij}‘,. The modified challenge sets are aggre 


We give further details later. 
Given S*, the PoRet is computed as gP ORet <_ SW.Prove(y,S*). 
The full proof of geo-retrievability then consists of the I geo- 
commitments and the PoRet, 78°° = Í fee K zP ORet l. m8©° is 


m=1 
given to the auditor V for verification. 


Proof verification (NIVerify): The auditor checks anchor tran- 
scripts in the I geo-commitments using the anchor’s public key. 
Then the auditor derives PoRet challenges from transcript signa- 


tures as in proof generation. The coefficients įrj} and aggregate 
challenge set S* are similarly computed. V 


challenge coefficients because SW-P.Commit contains only linear 
operations and the vector commitment scheme is homomorphic. 
The final verification step is to check that the two timestamps are 
close in all geo-commitments, namely that t2—t < A(La,Lp), where 
A is a pre-agreed upon function that takes anchor, provider loca- 
tions as inputs and outputs the maximum runtime of NIGeoCommit 
operations (those happening between times ty, t2). We now discuss 
how A is set. Figure 12 in the Appendix specifies the GoAT protocol. 


Setting A: Deciding A requires effectively striking a balance be- 


tween completeness and soundness. To achieve completeness, A 


At the same time, A should output low enough values to prevent 
cheating, i.e., improper location of a file, by a cheating provider. 

As shown in Fig. 5, the time difference t2—t, captures the time 
taken to run two operations: a GetAuthTime API call and a PoRet 
commitment. Denote the maximum time for the two operations by 
tping and tcom respectively; we then have A(La,Lp) =tping + tcom- 

Elapsed time for the GetAuthTime API call depends on the phys- 
ical distance between the anchor and provider. We have tping = 
rttmax(La,Lp) + tproc Where the first term denotes the maximum 
round trip time introduced in Sec. 4.1 and tproc denotes the maxi- 
mum processing time by the provider and anchor (processing time 
accounts for the time taken to compute a response, see Sec. 5.2 for 
details). Therefore we have: 


A(La,Lp) =(2-dist(La,Lp)/Smin)+tstart+tprocttcom. (1) 


We later prove that for a provider to succeed in a PoGeoRet proof 
for F, most of F must be stored within A(L,,Lp)-Smax/2 distance 
of the anchor location L4. 


Crucially, the radius of the ROU grows linearly with A(La,Lp). 
This serves as a motivation to minimize computation time in a 
PoGeoRet a8 miuch as possible, Indeed, it is to reduce com that we 


introduce a commitment function as a means to commit to a PoRet 
proof before generating the proof itself. 


Grinding attacks: Since NIGeoCommit protocol is prover-initiated, 
an adversarial prover can exploit by re-running the protocol. For 
example, an adversary could save on storage by only storing a 
portion of the file, and repeatedly query the anchor until all the 
challenges lie in the stored part. 

Let g be the stored fraction. To model practical constraints, we 
assume that a prover can make upto 2* GetAuthTime API calls per 
interval (this number needs to be set based on the actual API call 
costs). The success probability after 2% API calls is p=1—(1—g*)?". 
The adversary needs to choose the file-fraction g such that p is 
Q-A ) ze ) 1/k 


non-negligible, i.e., g > (1-(1- (or) g > 274 (via 


binomial expansion). Intuitively 


raised, the adversary is forced to store more. We derive an exact 


constraint involving k and a in our security proofs. 


Coefficient randomization: Randomization at the end of an epoch 
is necessary to ensure that the PoRet commitments {com;} are cor- 
rectly computed in all intervals. If the ratio between any two ran- 
dom coefficients was predictable, e.g., say t=r;i/r; was known for 
some i< j, then an adversary could cheat by postponing file access 
required to be done in the ith interval to the jth interval. Simply set 
com; to random and com; in a way that the verification equation 
checks out, i.e., comj = (Hj(com;)~!)*Hj. H; and Hj are the actual 
ith and jth PoRet commitments that the adversary computes in 
the jth interval. More formally, we later show that an adversary 
that skips PoRet commitments cannot succeed in verification, as it 
is equivalent to breaking commitment binding, which can happen 
with negligible probability. 

We ensure a negligible likelihood of guessing the random coeffi- 
cients {r;} a priori by deriving them from the final PoRet commit- 
ment comz. This still leaves possible grinding attacks. The best strat- 


The probability of success is 2-4 (as 24 is the size of the group used). 


e.g., [4]. 


4.3.2 GoAT-H. The key difference in GoAT-H is the use of a hash 
function as the vector commitment. This results in larger proofs 
and extra computational steps in Prove and Verify. 
Geo-commitment generation (NIGeoCommit) is same as before 
except the change in the PoRet commitment function. PoRet compu- 
tation 


If S; denotes the 
jth set of challenges, compute ae <— SW.Prove(7,S;); the final 


PoRet PoRet mr PoRety Accordingly, verifica- 


prPoRet 


proof is 2 {x 


| 


The proof size and computation times in GoAT-H are same as 
GoAT-P asymptotically, but with higher constants (about 3x bigger 
proofs). Geolocation quality remains the same. 


4.4 GOAT security 
We discuss the security of GoAT-H and GoAT-P now. GoAT-H op- 


on the commonly used “knowledge of queries” technique. On the 
other hand, GoAT-P’s security relies on a new assumption that we 


introduce now, called the KEV Assumption (KEVA). 
KEVA extends the commonly used KEA1 [14] for a vector of ele- 


ments. It states that 


is ci, cy while using the same inputs 
as before. This is saying that 
by picking a pre-image x and computing its Pedersen commitment. 


DEFINITION 2 (KEVAs). Given any set of distinct bases hı € G°, 
for any PPT A, there exists a PPT extractor Eq s.t. 


x—Eg4(hy,hz), 


hi =c 


a —sZp,h2 =ht 


i (c1,¢2) —A(hı,h2),c2 =c 


>1-negl(A). 


Say the target region is a single location, R'"8*t = (L;0). Then the 
region of uncertainty achieved by GoAT-H and GoAT-P is a circle 
centered at anchor’s location with radius 6, = A(La,L)-Smax/2. In 
practice, the target region might have a small diameter, R'?"8*t = 
(L;6’). As long as 5’ is small, we can approximate and define the re- 
gion of uncertainty as R" =(A;6’’) where 8” = max { p, eptarget }OL’. 


THEOREM 1. Let w=(p+ rii" ji: For any €< 1 s.t. €—w 
is positive and non-negligible and that the CDH problem is hard in 
bilinear groups, GoAT-H is (€,p)-sound at a target geographic region 
Riarset — (L;5’) achieving a geolocation guarantee of R"™ =(A;5") 
under the flexible challenge model and the random oracle model. 


Bo y proof sketches are in App. D. 


4.5 GoAT extensions 


We discuss two extensions to GoAT: making GoAT work with TLS 
1.2 anchors and decentralizing trust among anchors. 


4.5.1 Low-resolution anchors (TLS 1.2). The NIGeoCommit 
protocol described before assumes anchors provide high-resolution 
time. But most existing anchors today such as TLS 1.2 servers only 
offer second-level resolution. 

We deal with such anchors by amplification. The idea is to chain a 
sequence of proofs. Specifically, the prover alternates between com- 
puting a PoRet commitment and pinging the anchor, effectively fill- 


ing an entire resolution tick this way. For example, TLS servers offer 
second-level resolution, so an entire second is filled with alternating 
PoRet commitments and anchor pings. The required length of the 
chain of proofs is set by the amplification factor a. a grows inversely 
SEELEY gi higher A means lower 
a and vice versa. More details can be found in App. A. In summary, 
amplification only causes a minor degradation in geolocation qual- 
ity. But the proof size grows linearly with the amplification factor a. 
One other change needs to be made to support TLS. In GoAT-P, 
the vector commitment has two elements and won't fit into the 
nonce field of the TLS handshake for commonly used groups. So 
we include a hash of the commitment and reveal the underlying 
commitment as part of the proof. Details can be found in App. A. 


4.5.2 Decentralizing trust among anchors. It is straightforward 
to consider an extension to GOAT where as long as a threshold t 
number of anchors collude, the system is secure. This would come 
at the cost of somewhat more work to provers as they would have 
to execute NIGeoCommit with t+1 anchors every interval. But the 
‘computation time is not huge (See Sec. 5.2). 

chors. Previously each anchor produced a circular ROU centered 
at its location, but with t+1 anchors, the new ROU is the union of 


5 IMPLEMENTATION AND EVALUATION 


We implemented the more efficient variant of GoAT, GoAT-P, in 
approximately 2500 lines of C with support for both TLS 1.2 and 
Roughtime anchors. Our implementation uses TLSe [44] for TLS, 
Roughenough [43] for Roughtime and Relic [12] for pairings. We 
optimize the implementation of NIGeoCommit using the asynchro- 
nous I/O library libaio [6] and POSIX threads. 


Section structure: In Sec. 5.1, we discuss a number of setup con- 
siderations, and in Sec. 5.2, we present our evaluation results. 


5.1 Setup considerations 


For the purposes of this paper, we only aim to demonstrate the 


feasibility of our approach. We thus set parameters conservatively, 
favoring strong completeness with somewhat looser geolocation 


bounds than may be achievable in practice. For more aggressive 


parametrization, a detailed internet measurement study is needed. 


5.1.1 Network parameters. We set the maximum network speed 

of an adversary Smax = %c where c is the speed of light. This is the 
[32]. 

Estimating the minimum speed for an honest user Smin can be 

tricky due to inconsistent network quality across locations. Based 

on RTT data from Wonder Network [39], we set Smin = 9¢ i.e., 


speedup COS endl eel and the constant startup cost 


These parameter choices are consistent with recent work [19] that 
estimates the median RTT between PlanetLab nodes® and popular 
websites to be about 3.2x slower than speed of light; so Smin = 75 
is conservative. These parameters worked consistently across our 
experiments, and we emphasize again that our flexible-challenge 


model permits a prover to make multiple proof attempts over a 


5.1.2 Existing anchor discovery. To show that there is an existing 
network of servers that can serve as GoAT anchors, we perform a 
limited measurement study of existing TLS and Roughtime servers. 

In this study, we identify servers that return the correct time 
and have unique locations. We obtain server locations from an IP 
geolocation database, IP2Location.’ We verify location uniqueness 
heuristically by finding each server’s ISP and making sure it does 
not belong to a Content Distribution Network (CDN) [2]; servers 
that use CDNs do not have a fixed location since they respond from 
a replica closest to the query point. A stricter approach would be 
to perform a delay-based geolocation experiment validating that 
the server location is unique, e.g., [32, 46]. We do one such exper- 
iment for Roughtime on a small scale. For TLS 1.2 and Roughtime 
respectively, our findings are as follows. 


TLS 1.2: We focus on domains belonging to educational institutions, 
as we find they are more likely than other domains to have unique 
physical locations. We take the first 2850 domains from the Alexa 
top 1M list [1] containing the substring “edu”. We retain only those 
servers that return the correct time and whose ISP does not belong 
to a CDN provider. The result is a set of 300 domains that can be 
used as anchors, i.e., 10.5% of our original list. But this list is heavily 
biased towards anchors located in the U.S. (60% of the 300). So to find 
anchors for a different location, we apply more specific filters—e.g., 
to find anchors in UK we search for domains ending with “.ac.uk”. 

We also limit ourselves to using only those TLS 1.2 servers that 
use RSA for authentication. This is done purely for implementation 
convenience. We find that the proof transcript length for RSA-based 
servers is 389 bytes, which includes a 256-byte signature. 

TLS uses TCP in the transport layer. Therefore in a standard TLS 
connection, it takes two round trips to get time: the first round trip 
establishes a TCP connection while the second gets the time. An 
important trick for better geolocation accuracy is to open all TCP 
connections prior to the start of the NIGeoCommit protocol. In our 
implementation, we open n sockets to the anchor in parallel if n 
pings are needed in NIGeoCommit. 


Roughtime: We are aware of the existence of four Roughtime 
servers as of Apr. 2021. All of them return correct time with mi- 
crosecond granularity. To check that their locations are unique, 
we perform a small geolocation experiment by sending an ICMP 
ping request from two vantage points: North Virginia (NV) and 
Singapore (SP). In this process, we identify one of the servers as 
unusable for geolocation, as it has a RTT of 17ms from NV and 
30ms from SP, suggesting it is sitting behind a CDN provider. We 
find that the proof transcript length for Roughtime is 360 bytes. 


8PlanetLab nodes tend to be well-connected to the internet, matching our expectation 
of storage provider’s connection. [19] also picks geographically diverse nodes. 
°These databases are known to have some errors [26] and a rigorous geolocation 
experiment like [46] would have to be done before deploying our system. 


Anchor processing times: 

called the anchor pro- 
cessing time (taproc)- This is measured by pinging 114 servers at 
repeated intervals over two weeks both via TLS (with TCP connec- 
tions established apriori) and ICMP (for raw RTT). The processing 
time is defined as the difference between the two. We compute 
the average processing time for each server, and then the 75th 
percentile over all the servers, which is oe 
the remaining 25th percentile are discarded. Note that setting a 
somewhat high value of 6.5ms for all TLS servers is conservative—a 
better approach is to set anchor-specific values. 

For Roughtime, we find that the processing times are almost 
negligible, we set tapiae =2ms. This could be due to a combination 
of several factors, e.g., less load, faster transport layer (UDP) [19] 
and faster signature scheme (EdDSA). 


=6.5ms. Anchors in 


5.1.3 GoAT parameters. We talk about how various parameters 
in GoAT are set now. App. C discusses some associated tradeoffs. 

For SW PoRet, we use the BLS12-381 curve. Except in one exper- 
iment below, we set the number of sectors per block, s=96. 


As we discuss in Sec. 4.4, (p+ A + (1-27 At@)/k))k needs to 
be negligible. Assuming the grinding constraint æ =40, one set of 
parameters to achieve 128-bit security are code rate p = 0.33 and 
number of challenges k=250;'° note that the bandwidth cap is set 
to #=0.001|F*| using the economic analysis from Sec. 3.5.1. 

For the experiments below, we set the number of challenges k= 


100. We expect minimal impact on results due to the slightly lower k. 


Remaining parameters: In eq. (1), two more parameters remain to be 
set, tproc and tcom. Note that we separate the processing time tproc 
into client (tcproc) and anchor (taproc) components, with the latter 
discussed before. tcproc corresponds to the time spent in handling 
the anchor response. We set feproc =1.5ms and teom =2ms based on 
code benchmarks (the latter is discussed below). 


5.2 Evaluation 


We evaluate GoAT through several benchmarks and perform a real- 
world experiment over a week (Sec. 5.2.1). For most benchmarks, 
we use an AWS c5.4xlarge machine with 16 CPU, 32GB RAM and 
2TB io2 SSD that is capped at 20k IOPS. The io2 SSD is only used for 
experiments with small duration as it is more expensive, whereas 
for the long experiment in Sec. 5.2.1, we use a 100 IOPS, 30GB gpt2 
SSD. We do not expect this decision to have a significant impact as 
we show below that the effect of file sizes is negligible. 50 samples 
were taken in all experiments to compute the mean and standard 
deviation (shown in brackets). 


PoRet commit time (vs) file size: As explained in Sec. 4, PoRet 
commit time has a direct impact on the ROU radius. Table 3 presents 
the time taken to compute the PoRet commitment as a function of 
file size (128MB to 256GB). The times are all small (1-4ms) thanks 
to our parallelized implementation (we set tcom=2ms which works 
for files below 16GB). Of the numbers shown, about 1ms is spent 
on the actual commitment computation, while the rest is for file 
reads. We use x64 Assembly accelerated code provided by Relic 
for EC operations and further optimize it using a multi-threaded 
implementation (by breaking up a vector into smaller ones). The file 


10 Another set of parameters with higher code rate is p=0.5, k=360. 


File size , Time (ms) 
128MB | 1.09 (0.02) 
1GB | 1.02 (0.02) 
4GB 1.02 (0.02) 
16GB | 1.04 (0.02) 
64GB 4.27 (0.22) 
256GB | 4.06 (0.22) 


Table 3: Time taken for PoRet commit with standard deviations. 


#intervals PoRetCompute (ms) Verify (ms) 

1 18.11 (0.06) 47.28 (0.02) 

10 183.65 (0.43) 320.81 (5.59) 
100 1,838.44 (0.73) 2,991.14 (61.03) 


Table 4: Computation time of PoRetCompute and Verify (vs) no. of 
intervals per epoch. Standard deviations in brackets. 


Protocol Anchor type Proof size 
GoAT-H Any Tae(s+1)+I(a+1)|T| 
GoAT-P TLS 1.2 (s+1)e+2Iae+I(a+1)|T| 
GoAT-P Roughtime (s+1)e+21|T| 


Table 5: GoAT proof sizes. e stands for the size of a single element 
in G or Zp of the Shacham-Waters PoRet scheme. 


read times are largely constant except for an abrupt jump at 64GB. 
This happens because the cache is no longer useful and therefore 
we switch to using Direct I/O.!! 


Computation costs: Table 4 presents the time taken for the Prove 
and Verify operations. Here we assume a fixed epoch length and 


vary the number of intervals. Recall that with more intervals per 
epoch, the location guarantee gets better. As shown, with 100 in- 
tervals, Prove takes about 2s and Verify takes around 3s. Concrete 


costs are negligible for both operations (our AWS instance cost us 
$0.376 per hour). Also note that the effect of number of intervals 
on both Prove and Verify computation times is close to linear. 

In the above experiment, we set the amplification factor a to 1. 
A similarly linear effect is expected if a is varied. 


Communication costs: With Roughtime anchors and the BLS12- 
381 curve, GoAT-P proof size is 1941 +7201 bytes. The first half 
of the equation (constant part) is contributed by the PoRet proofs, 
while the second half by anchor transcripts. If 1= 100, GoAT-P proof 
‘size is 72.2KB with a dominating 70KB of anchor transcripts. 

To show the dominant effect of anchor transcripts, we fix the 
interval length to f =1hr and plot the proof size per interval as the 
epoch length is increased. The effect can be clearly seen in Fig. 6. 
The plot converges at 720B, the size of 2 Roughtime transcripts. 

The proof sizes for all GoAT variants are in Table 5. Using same 
parameters as before (I= 100), if a TLS anchor (amplification a=20) 
is used, the proof size of GoAT-P is 799.64KB. Whereas the proof size 
of GoAT-H is around 4.5MB for TLS anchor (about 5.7x bigger than 
GoAT-P) and 265.5KB for Roughtime anchor (about 3.6x bigger). 


"Direct I/O (the “O_DIRECT” flag) is a way to avoid entire caching layer in the kernel 
and send the I/O directly to the disk. 
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Figure 6: Proof size per interval of GoAT-P with a Roughtime (RT) 
anchor against the epoch length. Interval length J = 1hr. Dashed 
line (720B) corresponds to the size of two RT transcripts. 


5.2.1 Experiment. We devise a small experiment to demonstrate 
the practical feasibility of GoAT, specifically how it deals with 
network volatility. We focus on the NIGeoCommit protocol alone 
as it is the sole operation affected by network conditions. Prior 
works [29] have observed network stability over long time periods, 


and conclude that network instability is frequent but most often 
transient. So we handle failures in NIGeoCommit by simply retry- 
‘ing until success. Concretely, the number of retries is capped at 30 
with a gap of 1 second between retries. In this process, we count 


the number of retries needed to succeed and the false rejection rate, 
if any. Under ideal network conditions, 0 retries are expected. 

We run the prover from two AWS instances located in North 
Virginia (NV) and London (LON). Five anchors, screened for the 
criteria described above, are picked near each. The interval length 
is set to 8=30mins and the NIGeoCommit protocol is run for 10 
days at NV (525 intervals) and 7 days at LON (347 intervals). 

Table 6 shows the ten anchors used (the three Roughtime anchors 
are identifiable by their prefix). The first five anchors are used with 
the AWS instance in NV, the remaining with the one in LON. The 
2nd column shows the distance d between the anchor and AWS 
instance (provider), the 3rd column shows the amplification factor a, 
and the 4th column shows the ROU radius ô along with the ratio 6/d. 
These three column values are computed as previously described. 


Geolocation accuracy: The ROU radius to distance ratio d/d 
is useful to understand the key factors contributing to the qual- 
ity of geolocation. Recall from Sec. 4 that this ratio is given by 
(w È ((tcom + tstart + tproc)Smax/2dist(La,Lp))). For providers far- 
ther away from the anchor, we see the ratio converging to speedup 
w = 3 suggesting that distance-to-the-anchor is the dominating 
factor. But for nearby providers, we see high ratios going up to 46; 
so the worse geolocation is caused by constants like fstarts tproc- 

As noted before, we choose parameters quite conservatively. If 
finer geolocation is desired, aggressive parametrization can help. 
For example, the variable tstart (startup cost) alone is responsible for 
nearly half the geolocation radius of “roughtime.chainpoint.org”. 
It can be reduced with a refined network model, as we found that 
only some anchors require this extra time. App. C discusses various 
other optimization strategies. 

TLS anchors achieve somewhat worse geolocation compared to 
Roughtime ones due to the higher processing times; for example, 
compare “american.edu” and “roughtime.chainpoint.org”. 


Robustness of our network model: The last column in Table 6 
shows a statistical picture of the number of retries required to suc- 
ceed during the experiment period. The anchor “holycross.ac.uk” 


Anchor name Distance a ROU radius (5/d) #retries (SD) 


roughtime.chainpoint.org 46.00 1 1187.27 (25.81) 0.06 (0.23) 
roughtime.sandbox.google.com | 115.33 1 1395.26 (12.10) 0.02 (0.13) 
www.american.edu 43.99 60 1665.51 (37.86) 0.03 (0.47) 
www.sunysuffolk.edu 450.29 34 2939.14 (6.53) 1.00 (0.06) 
roughtime.int08h.com 1582.83 1 5797.76 (3.66) 0.01 (0.11) 
holycross.ac.uk 35.26 61 1638.21 (46.46) 0 (0) 
sruc.ac.uk 58.83 58 1722.94 (29.29) 0.67 (1.04) 
gold.ac.uk 87.45 55 1816.92 (20.78) 1.02 (0.15) 
nott.ac.uk 175.19 48 2081.89 (11.88) 2.26 (1.76) 
www.ed.ac.uk 533.67 31 3223.57 (6.04) 0.003 (0.05) 


Table 6: The ROU radius (ô) and the distance b/w anchor and closest 
AWS instance (d). All distances are in km. Last column shows the 
mean, standard deviation (SD) of the number of retries. 


behaved perfectly requiring no retries throughout. Whereas the 
anchor “www.sunysuffolk.edu” was the only one to fail—it failed 
in 4 of the 525 intervals, i.e., 0.7% false rejection. The four failures 
happened in consecutive intervals suggesting a period of bad server 
response times. We expect system designers to select several an- 
chors in each location to avoid false rejections in practice. The 
maximum number of retries required to succeed was 13 (seen once 
with “sruc.ac.uk” and “nott.ac.uk’). 


6 RELATED WORK 


A long line of works aim to prove correct file storage by a storage 
provider, e.g., Proof of Retrievability [31, 40], Proof of Data Posses- 
sion [13, 27] and more recently Proof of Replication [16, 21, 24, 25]. 

To the best of our knowledge, only few works [18, 47] aim to 
prove file location. [18] works with small files as they rely on di- 
rectly fetching file parts and leave open the task of combining 
a PoRet with a proof-of-location (PoL). [47] combines Shacham- 
Waters PoRet scheme with a PoL, making it the closest to our work. 
But they make benign assumptions about storage providers, e.g., 
providers operate at normal network speeds. In contrast, GoAT con- 
siders faster speed-of-light providers and yet achieves geolocation 
accuracy similar to [47]. Digging deeper, this is due to our novel use 
of fast PoRet commitments whereas [47] naively combines the SW 
PoRet and PoL protocols. [47] also entrusts anchors with proof veri- 
fication making the use of legacy anchors impossible unlike GoAT. 

Most geolocation technologies in use today (e.g., GPS, Bluetooth 
beacons [30]) rely on trusted verifiers and are hence unusable in 
decentralized systems. 


7 CONCLUSION 


We have presented GoAT, a practical Proof of Geo-Retrievability 
(PoGeoRet) scheme for file geolocation. GoAT leverages timestamp- 
ing internet servers for proving location and the Shacham-Waters 
PoRet scheme for proving file retrievability. We formalized the no- 
tion of PoGeoRet soundness by extraction from devices located 
within a geographic boundary. We also presented a few practical 
model variants that facilitate realization of GOAT. GoAT has a 
unique challenge model that permits batching proofs over several 
intervals and verifying them at the end of an epoch. GoAT proofs 
are small due to aggregation of PoRet proofs across the epoch. We 
have demonstrated GoAT’s practicality through a fully functional 
implementation and a real-world experiment. 
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Non-interactive Proof of Geo-Retrievability 


(sk,pk) — NIKGen (14): Generate key pair. Run by the user. 

(F*,n,pp) — NiSetup(sk,pk,F): Runs setup of the underlying PoRet scheme 

to generate F*, which contains the file plus the generated data, its handle y, 

and some public parameters pp. Run by the user. 

m8°° <— NIProve(y, R): On input file handle 7 and a geographic region R, 

generates a proof of geo-retrievability 7®°°. Run by the prover. It consists of 

two sub-functions: 

- C° e NiGeoCommit(7, R): On input file handle 7 and a region R, 
generate a geo-commitment C®°°. An interactive protocol between the 
prover and anchor. Furthermore, the protocol com <— NICommit( y) is a 
sub-function of NIGeoCommit which takes a file fragment p as input and 
generates a commitment com. 

- mRet — NIPoRCompute(y,S): On input file handle y and a set of PoRet 
challenges S, compute one or more proofs of retrievability. 

0/1<NIVerify(pp,R,c,28°°): The verifier checks that the file is in the desired 

region R by verifying the proof zr8°° using the challenge, public params. 
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Figure 7: NIPoGeoRet API. 
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SUPPORTING TLS 1.2 ANCHORS 


1 Low-resultion anchors 


This section deals more broadly with supporting low-resolution 
anchors. 


Chaining of the two operations is done in a similar fashion to 


before. In total, a PoRet commitment computations and a+1 anchor 
pings take place. We refer to a as the amplification factor. Note 


Geo-commitment generation (NIGeoCommit) between P and A 
Prota: Follow the standard protocol (TLS 1.2 / Roughtime). 
Protp: On input {7,pp }, runs the below protocol. 
If I, < 1ms, set the amplification factor a=1. Or else a= [T\/A(A,P) ]—1. Select 
a random Nj, i=1 and do the following: 


(1) (Anchor ping) Request time from the 
A.GetAuthTime(Nj). If i=a+1, then break. 

(2) (PoRet commitment) Run S; — PoRet.Chal(y, pp, ci), com; — 
PoRet.Commit(7,S;). Set Nj; =com;, i=i+1 and repeat from step 1. 


P saves C8®°= {T; je where T; = {t;,Nj,0; } denotes the transcript. 


anchdit;,N;j,0; } = 


Geo-commitment verification by V 


On receiving seed, epoch number e, interval number m, anchor public key pk, 


and the geo-commitment C8°° = { {T }24, {com; }%, }, the auditor V does: 


e Set N; =seed and Vi € [1,...,a],Nj+1 =comj. 

Verify anchor signatures using pk,, Vi € [1,....a+1], Verify pk, (oi, {ti,Ni }) 
where T;= {tici}. 

Check that the time corresponds to epoch e, interval m. 

Check that the timestamps are close: 

— If a> 1, check that the time is same, tı = t2 =... = ta+1- 

- If a=1, check that t-t, < A(A,P). 


Figure 8: Geo-commitment protocols. 


Shacham-Waters PoRet scheme 
Scheme parameters: A bilinear group (p,G,G/ ,e,g). Number of challenges k. 
Number of sectors per block s. An erasure code with rate p. SW-H uses VC = HVC 
and SW-P uses VC=CPVC. 


e (sk, pk) — SW.KGen(14): Pick key pair (ssk, spk) — KGen(1“). Choose 
a € Zp at random and compute g“ € G. The secret key is sk = (ssk,a) while 
the public key is pk=(spk,g“). 

(F*,7, pp) <— SW.Setup(sk, pk, F): Apply erasure code over F to obtain F’. 
Split F’ into n blocks, each s sectors long {mij } with mij € Zp. 


Pick s elements at random {u;};_, € G. For each i € {1,...,n } compute 
ci — (H(i). j= iD where H is a hash-to-group function. Denote 
F* as the file together with o;,1 < i < n. The public params‘ pp contains 
{pk,n, {ui }4_, } along with a signature generated with ssk. 7 =H(F*). 

{ci,0; +e <— SW.Chal(4,pp,seed): Derive k values c; € [n], vi € Zp from 


the input seed. Return {c;,v; Vee 


Cp — SW.Commit(y, {civi }K,): Compute Vj € [1,...,s], 4j — zhom, 
, é s 

where mj = Mc)j- Commit to the vector yp = {uj Fa by 

Cp VC.Commit(p). 

SW.Prove(n{ci,vi Ei: Compute o = 


en e mk, rii) and 


Vje [1.5], j ezk oim, where m;; =M(c;)j- Output m= {p,0 } where 
s 

H= {u Ear 

0/1 = SW.Verify(pp,{c;,vi Je , Cy, 7): Receive 1 = {p,0 }. Check 

signature on ¢ with spk and parse it receive {u;i}; Check if 


e(o,g) =e(TTk, (H(c;))% sauti pk). Check if VC.Verify(p1,Cy) =1. 


“Referred to in the original Shacham-Waters paper as tag. 


Figure 9: The Shacham-Waters PoRet schemes with an extra 
commitment step. SW-H, SW-P differ in the choice of VC scheme. 


that this modification applies to both GoAT variants, GoAT-H and 
GoAT-P. 

The value a is set based on the exact resolution offered by an 
anchor. For example if the anchor resolution is in seconds and 
the time difference A(La,Lp) is 50ms, then 20 consecutive proofs 
(when started at a one-second boundary in the anchor’s clock) 
will have the same timestamp, so a = 19 (since a+ 1 pings are 
needed). More generally, if the resolution of an anchor is T4, we 


Proof of Retrievability 

e (sk,pk) — PoRet.KGen(1“): Generate key pair. 

e (F*,n,pp) <— PoRet.Setup(sk,pk,F): F* contains the encoded file, 7 denotes a 
unique file handle and pp contains the public parameters. We model the public 
key pk as a part of pp. 

e F e PoRet.Extract(7, pp): An interactive function between a prover and 

verifier to recover original file F. 

c< PoRet.Chal(1,pp,seed): Derive a challenge c from the input seed for the 

file y. 

e C e PoRet.Commit(7,c): Generate a commitment C to the proof based on 
the challenge c. 

e a+ PoRet.Prove(7,c): Generate a proof z based on the challenge c. 

e 0/1 < PoRet.Verify(pp,c,C,2): Verify both the commitment C and proof m 
using the public parameters pp. 


Figure 10: Publicly verifiable PoRet API. PoRet.Commit is the only 
addition compared to prior modeling [31]. 


Correlated Pedersen commitment CPVC 
Params: Group G and it’s support Zp. Supported vector size s. Generators 
(hi,h2,...hs)— G and (hf,hg,...,.hg) —G. 
e Cy—CPVC.Commit(v): Receive v= {v; }j_-, where Vi,o; € Zp. Output 
Co = (Mh Minsky). 

e 0/14 CPVC. Verify (v,Cy): Check if Cy = (TT, h;', 

Hash-based commitment HVC 
e Cy HVC.Commit(v): Output Cy =H (v). 
e 0/1<-HVC.Verify(v,Cy): Check if Cv =H (v). 


s h”). 


i=1 "i 


Figure 11: Pedersen and Hash-based Vector Commitment scheme 


set a=|T,/A(La,Lp)|—1.'* Below, we explain how to time proof 
execution in order to ensure receipt of a+1 transcripts with the 
same timestamp. 

In Prove, the prover computes a single PoRet similar to before, 
leveraging the aggregability of SW. We also make a change to Ver- 
ify: instead of checking the difference between timestamps, the 
verifier counts if a+1 anchor transcripts have the same timestamp. 
Other steps are similar to before. 

A general NIGeoCommit protocol for any anchor, low- or high- 
resolution, is specified in Fig. 8, in the paper appendix. 


When to start execution?: We have a question of when to initiate 
the protocol so that a+1 anchor transcripts have the same time. A 
simple approach is to continue executing proofs for roughly double 
the amplification factor a, specifically to use an augmented amplifi- 
cation factor a’ =2[T,/A(La,Lp)]—1. Irrespective of the start time 
in this case, the resulting sequence of transcripts are guaranteed 
to contain a (a+1)-length sub-sequence with the same timestamp 
(given stable network conditions). The final proof will only include 
the desired sub-sequence; extra transcripts can be discarded. The 
intuition here is that a’ executions guarantees seeing two time 
changes (i.e., three distinct timestamps), therefore one resolution 
tick is fully covered, which in turn guarantees |T /A(ZLa,Lp)]| tran- 
scripts will have the same timestamp. As noted, this will not work 
if the network conditions are unstable, and other mechanisms like 
retries are needed in practice. 


Effect on geolocation: The use of amplification has a small ef- 
fect on the radius of ROU, explained through an example. Suppose 
A(La,Lp) = 250ms, T4 =1000ms. Applying the above formula, we 


2% theory, a = [T,/A(La,Lp) | also works as a- A(La,Lp) < Th. But for perfect 
divisors, e.g., A(La, Lp) = 50ms, this can only be achieved with perfect time 
synchronization and ideal network conditions, making it impossible in practice. 


NIPoGeoRet scheme between U, P 
Scheme parameters: List of anchors / and their corte pondidg public keys. 


Interval length J, number of intervals I. 
Proty,Protp,Proty : 


e (sk,pk) — KGen(12): U runs (sk,pk) — PoRet.KGen(12). 

e (F*,n,pp) — NISetup(sk,pk,F): U runs (F*,7,pp) — PoRet.Setup(sk,pk,F) 
and picks a geographic region R = (L;ôL). Values {F*,pp,R} are given to P 
situated at L. 

e C° < NIiGeoCommit(y, R): P selects an anchor A € J based on the 
input region R and generates geo-commitments via NIGeoCommit (See 
Fig. 8) once every interval until the end of epoch. Note that NICommit(p) 
= PoRet.VC.Commit(1). 

e 7PoRet e NIPoRCompute(7,S): Let S= {Sj yy denote the PoRet challenge 

sets derived in NIGeoCommit. The process differs by the PoRet scheme: 

- SW-H: Run Vj acyoRet <— SW-H.Prove(1,S;). gphoRet — {pore Ta 

- SW-P: P derives N random coefficients in Zp from the last PoRet commit- 

A < PRF(comy). Denote S; = {cijo} 


ment, {rj . Apply random 


coefficients, Vj, S; = {Cij, 172i; W and merge all the sets to create, 
S* =UN S}. Compute sPoRet — SW-P.Prove(1,.S*). 
m8°° — N|Prove(7,R): The two sub-functions of NIProve are specified above. 
The proof of geo-retrievability 78°° consists of geo-commitments output by 
NIGeoCommit and the proof(s) of retrievability output by NIPoRCompute. 
e 0/1 — NiVerify(pp, R, 28°°): Unpack 78° = ur H, T _,, mPRet}, The 
geo-commitments are verified first using the protocol in Fig i 8. Then the proof 
of retrievability 7r”°R®t is verified using the steps described below. 
- SW-H: Check Vj, SW-H.Verify(pp, cj, com, ah oRet) = 1 where 
grPoRet — R ae 


- SW-P: Denote nPar = {p,o}. V generates random coefficients {rj Fa 
and aggregate challenge set S* similar to how P does in Prove. V computes 
C= me (com;)’) and checks if SW-P.Verify(pp,S*,C,7°R*) = 1. 


Figure 12: The GoAT proof of geo-retrievability schemes. It includes 
both the GoAT-H and GoAT-P variants that internally use SW-H and 
SW-P PoRet schemes respectively. 


get a=3, i.e., 4 pings are needed. But this leaves some “extra time’— 
for example, if the anchor’s clock times at the moment of receipt 
of the 4 GetAuthTime requests are x, x +250, x+500, x +750 (all 
in ms), then an adversary still has about 250ms left in the end (As- 
sume x is a second boundary). So an adversary can spend an extra 
250/a=83.33ms on each of the a PoRet commitment computations 
and thus position the file further from the target location than with 
no amplification. Such manipulation will go undetected because 
the difference between the last and first anchor clock times is still 
within a resolution tick, 750+83.33-3=999.99ms <I. 

The precise extra time available due to amplification is e = 
Ta —a- A(La,Lp). Distributing it equally leads to an extra e/a 
time per commitment computation. For practical values, the ex- 
tra time is small and hence its impact is minimal. For example, if 
A(La,Lp) =50ms and T4 =1000ms, then e=50/19=2.6ms causing 
about 260km increase compared to that without amplification. 


GoAT security: Considering both high-resolution and low-resolution 
anchors, the following equation describes GoAT’s geolocation radii. 

Say the target region is a single location, R'@"8*' = (L;0). Then 
the region of uncertainty achieved by GoAT (both GoAT-H and 
GoAT-P) is a circle centered at anchor’s location with radius ôr 
given by: 


A(La,L) ‘Smax/2 
(Ta/(LT4/A(La.L) ]-1))-Smax/2 


if Ta <1ms. 


otherwise. 


A.2 Changes to Commit 


As noted in the main body, the PoRet commitments in GoAT-P 
won't fit into the TLS nonce field. For example, the size of each 
group element in our implementation is 20 bytes, so the SW-P PoRet 
commitment is 40 bytes whereas the TLS nonce is 32 bytes only. 

So we modify the PoRet commitment protocol by hashing the 
previous commitment to fit in the nonce field (which is essentially 
in turn modifying the Commit protocol). The output of the Prove 
protocol, i.e., the PoGeoRet proof will now include all the PoRet 
commitments generated during the epoch. 

If the number of intervals is 1, the proof will consist of a proof- 
of-retrievability, a PoRet commitments and a+1 anchor transcripts. 


B FORMALISM EXTENSIONS 


B.1 Non-interactive 
Proofs of Geo-Retrievability 


Non-interactive Proofs of Geo-Retrievability or NIPoGeoRet allows 
any newcomer to verify that the prover indeed had the file inside 
the region of uncertainty (ROU), during a specified time duration. 
The NIPoGeoRet API (Fig. 7) is almost the same as the PoGeoRet one 
except that the function Chal is removed. We attach the preamble 
NI to other API functions, e.g., NIProve and NlVerify. 


Relation to GoAT: Recall that GoAT is a non-interactive protocol. 
So the API in Fig. 7 map to the GoAT protocol specified in Fig. 12. 

For ease of explaining GoAT, we divide NIProve into two sub- 
functions, NIGeoCommit and NIPoRCompute. The former speci- 
fies the interaction with anchor A to derive challenges. NIGeoCom- 
mit for GoAT is specified in Fig. 8. 


Modeling time: In our previous modeling for interactive PoGe- 
oRet, we relied on the verifier to keep track of time during the 
security experiments. Instead now we introduce a notion of time 
into the definition. Each system entity maintains an internal clock. 
Clocks need not be synchronous, but we assume that clock drift 
is negligible. The clock time of say an anchor A is given by time,. 
If the true time is given by true_time, then the clock offset of an 
entity A is (time, —true_time). The offsets of all anchors are as- 
sumed to be public (this can be observed once during a setup phase 
in practice). Note that the NIVerify function relies on these clock 
offsets to judge if the proof is valid. 


Security properties: The completeness definition is the same as 
before, except that no challenges are issued by the verifier. 

The changes to the security experiments related to soundness 
are also minimal. The setup experiment is same as before, except 
that the public information pp could also contain extra information 
such as anchor public keys. The challenge experiment now does not 
involve sending challenges to the prover. Instead, the prover com- 
putes NIPoGeoRet proofs itself, and submits a proof at the end of 
an epoch. This proof is verified using NIVerify. And the soundness 
definition is the same as before. 


B.2 Rational file retention 


We expand on the rational file retention assumptions made in 
Sec. 3.5.2 now. The device API is largely same as in Fig. 3 except a 
change to the createDevice function. The greedy storage algorithm 
greedyFill is in Fig. 13. 


Modified device oracle Oey 


State: A region R'. Key-value pairs D [did] = (loc,cap, mem) where the 

key did is the device identifier, loc is its location, cap is the memory capacity 

and mem is a list denoting the memory. A list "°° to track inputs to the 

commitment oracle O% m- 

2: init(R): Set R" =R. Not callable by the adversary. 

3: createDevice(did,loc,cap): If did € D return L. Set D[did] =(loc,cap,_). 
exec(did,func) — out: If did ¢ D return L. Compute func and return its out- 

4: put. func can read / write to D[did].mem or call any of Ogey functions inter- 

nally. If Om is called with input u and D[did].loc € R", do p"°°.append(p). 

sendTo(did;,did2,data): If data € D[did,].mem, 

D|did2].mem.append(data). 

6: erase(did,j): Erase index j, D[did].mem.erase(j). Do D[did].cap—=1. 


7: seenInROU(p2"): Return 1 if Yu € p°", u € ps" holds. 


Greedy allocation algorithm greedyFill(mem) 


cur=0 
for R in{R',R°'} 
for did inD 
available = D[did].cap—|D[did].mem| 
if available <0 continue 
if D[did].loc €R then 
D|{did].mem.append(mem[cur: cur+available]) 
cur+=available 
endif 
endfor 


endfor 


Figure 13: The modified device API. 


#sectors | Time (ms) 


64 0.89 (0.01) 
1.53 (0.01) 
2.99 (0.01) 


128 
256 


Table 7: Time taken for SW PoRet commit without the file read oper- 
ation as a function of the number of sectors. Averaged over 10 runs. 


Note that the memory is composed of many blocks. The size of 
each block is dependent on the protocol. For the GoAT protocol, 
this should be the same as the block size used in the PoRet so that 
all bits in a block are in the same device. 


C PRACTICAL 
CONSIDERATIONS AND FUTURE WORK 


C.0.1 Parameterization trade-offs. We discuss various trade-offs 
arising in GoAT parameterization now. 

The number of sectors s impacts the proof sizes, geolocation 
quality and the storage overhead. Higher s leads to reduced storage 
overhead but at the cost of relatively poorer geolocation and worse 
proof size. Note that higher s leads to increased PoRet commit times 
and thereby worse geolocation (eq. (1)). 

The number of challenges k and the code rate p need to be set 
following the constraint given in Thm. 1. As shown in Sec. 5.1.3, 
for practical values of p, the number of challenges is around 200. k 
and p impact geolocation quality and storage overhead respectively. 


There is a direct trade-off between the two — higher code rate (p) 
leads to less storage overhead but requires setting a higher number 
of challenges (k), which leads to higher PoRet commit times and 
worse geolocation. 


C.0.2 Anchor clocks. For GoAT to work, we assume that the 
clock drift of anchors is negligible. This assumption was made to 
ensure that clock offsets can be observed once and used later on, 
avoiding the need for any time synchronization. Clock drifts in 
practice tend to be much smaller than the interval lengths in GoAT, 
and hence this assumption is reasonable. 


C.0.3 Finding anchors. In Sec. 5, we used just the basic require- 
ments in deciding whether a given internet server can be used as an 
anchor. In practice though, other considerations such as reliability 
(does the anchor have stable response times) and trustworthiness (is 
the anchor reputable enough) will have to be taken into account. As 
noted before, if relying on existing internet servers is undesirable, 
anchors for GoAT can be purpose-built. 


C.0.4 Optimizations to improve geolocation accuracy. One set 
of ideas is related to improving the network model. Our current 
network model is unified, i.e., it assumes the network conditions 
across the globe are same for simplicity. Taking endpoint locations 
into account can improve geolocation quality in areas with better 
connectivity. Moreover a network model that avoids the blanket 
use of a startup cost tstart (we set it to 5ms) is desirable given that it 
causes upto 2x worse geolocation for nearby anchors. In our small 
measurement study, we found a lot of variance in the round trip 
times for nearby locations. But since GoAT can deal with short- 
lived network variances better due to the use of flexible-challenge 
model, a smaller value for tstart could be used. More experiments 
to understand if this idea can be used in practice are needed. 

Another idea is to optimize the PoRet commit compute time (we 
set it to 2ms). This can for example be done by finding a pairing- 
friendly curve that has faster vector commit times and optimizing 
code runtime. 

With regards to the choice of anchors, using Roughtime servers 
is clearly beneficial if possible due to their low processing times. 
Otherwise finding TLS servers that respond quickly is suggested, 
i.e., have low processing times. Overall Roughtime is a better choice 
of anchor, both from a performance perspective and an ethical stand- 
point since our use of TLS might be seen as abusing it. We hope 
that Roughtime gains more adoption in the future. 

Several other optimization opportunities exist: reducing the 
client processing time by optimizing client-code (we allocate 1.5ms 
which could potentially be reduced to almost zero), using an anchor- 
specific model for processing times, and perhaps even deploying 
new anchors with fast connectivity and low processing times. 


C.0.5 Constructing a proof-of-space. One potentially impactful 
research direction is to extend GoAT to construct a Proof-of-Space. 
Currently, GoAT can only prove that a file F is geographically 
retrievable from a set of different regions. But if the file F is ad- 
versarially chosen, the prover might only actually need to store a 
small seed. 


D SECURITY PROOFS 


We now provide a proof sketch for Thm. 1. We primarily focus 
on GoAT-P with a Roughtime anchor adding notes about how the 
proof extends to GoAT-H (or) to TLS anchors where needed. 

Recall that the GoAT-P proof 28° consists of I geo-commitments 
and a PoRet proof. Each geo-commitment C8®° consists of a+1 an- 
chor transcripts and all but the first transcript contain a PoRet 
commitment. In total, N = Ia PoRet commitments are in a proof. 
Similarly, the GoAT-H proof consists of N=Ia PoRet proofs and I 
geo-commitments. 

We prove soundness of GoAT in four steps. 


(1) Prove that the N PoRet commitments and the PoRet proof(s) 
are correctly computed, i.e., the PoRet verification protocol 
(PoRet.Verify) part of Verify must detect otherwise. 

(2) A combination of timing and knowledge based arguments to 
prove that the Commit operation is run on a device inside RÌ”, 
i.e., prove that all file fragments part of a correct proof must 
have been queried to the commitment oracle OÏ? m. 

(3) Prove that the extraction algorithm can efficiently reconstruct 
p fraction of file blocks from the fragments in each of the I 
snapshots {S}. 

(4) Prove that the file can be reconstructed from any p fraction. 


The proof for part 4 follows directly from the properties of a 
rate-p erasure code, so we do not expand on it further. 


D.1 Part-two proof 


For this part, we need to prove that the commitment oracle OH m 
receives all file fragments that are part of a correct PoRet com- 
mitment, proof. (The latter is guaranteed by the part-one proof 
provided later.) 

We proceed in two steps. First we argue that the only way of 
computing a valid PoRet commitment is by computing Commit on 
valid file fragments. This relies on the KEV assumption (See Def. 2) 
for GoAT-P and the ROM for GoAT-H. Next we argue that all calls to 
Commit must take place from within the desired target region R'". 
This relies on a timing based argument. Overall, this proves that if 
correct PoRet commitments and proofs are computed, then the com- 
mitment oracle (Oi",,) records all the corresponding file fragments. 

The proof for first step is as follows. Given a valid PoRet com- 
mitment C, for SW-P, we need to prove the existence of a valid 
pre-image p. But the KEVA directly offers this. We can use the 
extractor provided by the assumption to efficiently extract p for 
every valid PoRet commitment. 

The proof for the second part is given below. We provide two 
arguments based on whether a high-resolution / low-resolution 
anchor is used. We begin with the high resolution setting. 

As noted before, we assume that the clock offsets of all anchors 
are observed apriori and that clock drift is negligible. So we can 
safely assume that the anchor timestamps lie inside the expected 
interval, as otherwise the geo-commit verification would detect. 


D.1.1 High-resolution anchors (a= 1). Fixing some notation, 
assume that the storage provider P is at a location Lp e Riarget and 
that the anchor assigned to Lp is A, located at L4. Recall that the 
target region in GoAT is a spherical circle centered at L4 with radius 


5 = A(Lp,L1)-Smax/2, i.e., the region R" = (L1;ô). Expanding the 
radius further we have, ô= (tcom+tttmax(Lp,L1) +tproc) *(Smax/2). 

Recall that in the case of high-resolution anchors, the prover 
computes one PoRet commitment per interval. We want to prove 
that all the J PoRet commitments are computed on some device in 
RI”. Assume the contrary, i.e., say there exists a device Dout situated 
at Lz € Rt on which one of the PoRet commitments is computed. 
By definition we have dist(Lj,Lz2) > ô. 

Without loss of generality, assume that a copy of the encoded 
file F* (generated during the setup experiment) exists in its entirety 
in the memory of Dout, and therefore the time taken to compute 
commitment on Dout is negligible, i.e., tfn = 0. We also set the 
anchor processing time fod =0. 

The time taken to receive and respond from Dout during the 
geo-commitment protocol with A is given by z=2dist(L1,L2) /Smax- 
This is because in Fig. 8 we derive challenges from anchor signa- 
tures, i.e., they arise at Lı and must reach Lz. We can assume that 
the adversary probability of guessing these challenges is negligible 
(requires breaking selective unforgeability of the signature scheme 
used by the anchor which happens with negligible probability). 

Note in particular that this value is irrespective of any other 
factors, e.g., the adversary’s strategy might be to place a device Din 
exactly at the anchor location Lj, and initiate the protocol from Din 
with challenges forwarded to Dout. Moreover, we do not include 
any startup cost when the adversary is sending messages between 
devices, so a =0. 

For the geo-commitment verification to succeed, it must be that 
z<A(Lp,L1). (See last step in Fig. 5 when a=1.) 

But we have a contradiction, as z must also satisfy z >25/Smax 
because dist(L1,L2) > ô. Substituting for ô we get z > A(Lp,11). 
Hence proved. o 


D.1.2 Low-resolution anchors (a>1) . The target region now 
has a slightly larger radius, ô = (T/a) - Smax/2. The proof is very 
similar to the previous case. The main difference now is that the 
verification algorithm checks if a+1 anchor transcripts have the 
same time. Therefore the prover is forced to execute a PoRet com- 
mitments sequentially. 

Recall that for low-resolution anchors, the prover computes a 
commitments every interval. Continuing in the same style as before, 
assume for contradiction that the prover tries to execute all the 
commitments in one of the intervals from Dout (Dout is setup in 
the same fashion as before). 


The time difference between last and first timestamp in NIGeoCommit 


is given by z=2adist(L1,L2)/Smax. Note that we are counting time 
from the moment anchor receives the first request to the moment 
anchor sends out the last response. 

To succeed in verification, it must be that z <T. Intuitively, this 
corresponds to a+1 timestamps having the same time. But we have 
a contradiction, as z must also satisfy z > 2a6/Smax, substituting 
for ô we get z >T. Hence proved. o 


Note on technique: One subtlety to note is that the following alter- 
nate amplification method that computes PoRet commitment only 
once does not work: ping, com, ping2, ping, ..., pinga, pingas1- 
At first sight it might seem like a reasonable approach as it can also 
fill up a large amount of time. 


But the proof does not go through because the adversary can 
decrease the time difference z as follows. Place Din negligibly close 
to A and initiate the protocol from it. Therefore, the time taken for 
all consecutive pings is negligible. In this case, the timestamp differ- 
ence will only be z=2dist(L1,L2)/Smax (incurred as the adversary 
would have to forward challenges required to compute com, from 
Din to Dout). 


D.2 Remaining proofs 


We now prove the remaining parts, part-one and part-three. 


D.2.1 Part-one proof. For this we reuse the proof for Theorem 
4.2 in [40]. They provide a series of games that prove that, except 
with negligible probability, no adversary ever causes a verifier to ac- 
cept in a PoRet instance, except by responding with values { Hj ho 
that are computed correctly (under the assumption that the com- 
putational Diffie-Hellman problem is hard in bilinear groups). This 
directly proves that if the challenger provides a challenge set S*, 
then the correctly computed output of SW.Prove and SW.Commit 
containing {Cumo} must be accepted by the verification algo- 
rithm SW. Verify. The only change we made is the extra vector 
commitment. Assuming that the binding property of the vector 
commitment scheme holds, this directly follows. 

The remaining thing to be proved is that all the individual PoRet 
commitments used to compute C = Cy are correctly computed. 
Assume for contradiction that some of them are not computed 
correctly. Observe that we derive random coefficients rj from the 
final PoRet commitment comy. These coefficients are used during 
verification to compute C as follows, C= IPA (com;)”™. Under the 
random oracle model, we can assume that the probability of prover 
guessing these coefficients beforehand is negligible. Note the two 
checks in SW.Verify: the commitment check (VC.Verify) and the 
pairing equation check. Assuming that the latter succeeds, that is 
the final commitment C is the same as that computed by an honest 
prover, then the only way prover can make VC. Verify succeed is by 
guessing the random coefficients correctly (or) by breaking com- 
mitment binding, both of which happen with negligible probability. 
Grinding concerns are discussed in the main body. 


D.2.2 Part-three proof. We re-purpose the extraction algorithm 
provided in the proof of Theorem 4.3 in [40]. [40] provides an extrac- 
tion algorithm that, given an adversary that answers e fraction of 
the queries correctly, can extract p fraction of the encoded file blocks 
provided that e—( pn)k /(n—k+1) is positive and non-negligible. 

Recall that our extraction algorithm Extract is composed of 
Extr.Derive and Extr.Assemble. And the extraction algorithm of 
[40] already follows this additional structure we impose. Querying 
the adversary corresponds to Extr.Derive and assembling the file 
from query responses corresponds to Extr.Assemble. 

The only change now is that extraction must succeed in every 
interval, i.e., Odey-seenInROU(y2"!) =1 Vie {1,2,...,.]} must pass for 
all the intervals. And the key question is how the new bandwidth 
constraint ¢ and grinding attacks (discussed in Sec. 4) impact the 
above theorem. 

Recall that the size of the encoded file is |F*|. Of this, due to 
grinding, at least g = (1 — (1 — 272) 1/%)1/K fraction is only stored 
inside R” and hence only that is available for extraction (æ is the 


grinding cap). And further, upto ¢ bytes (the bandwidth cap) of the 
g-sized fraction can be downloaded, and is hence unavailable. 

The idea in the proof of Theorem 4.3 of [40] is to query enough 
times and use linear algebraic techniques to recover file blocks from 
query responses. Queries are made randomly. Three types of queries 
are listed, and the fraction of type-1 queries (the useful ones that 
help recover file blocks) is e—w where w= (pn)k /(n—k+ 1k (omit- 
ting the negligible part of the equation caused by type-2 queries). 
The extractor needs pn <n type-1 queries to succeed, which hap- 
pens in O(n/(e—w)) time. 

The maximum number of blocks unavailable inside R' is given 
by y=( He) tn 1—g). Therefore the extractor needs more type-1 
queries to succeed, (pn+y). Note that we assume if a query chal- 
lenges a block that belongs to the unavailable portion in Sj, a special 
symbol “—1” is used in place of the file block, and the challenge 
response is computed. And by extracting (pn+y) blocks, we are 
guaranteed to have at least pn actual file blocks (removing the —1’s). 

The useful fraction of queries now is e — w where w = (pn+ 
y)*/(n—k+1)*. And assuming pn+y < n, extraction happens in 
O(n/(€-—w)) time, i.e., same order as before. One constraint we get 
is A <g-p. 

We want e- w to be positive and non-negligible. Therefore w 
needs to be negligibly small. Meaning (p+y/n)* (or) (p+ A + 


1 =a)" needs to be negligible. As noted above, the number of in- 
teractions required and the time to extract is the same order as in 
[40]. Oo 

Note that the above proof assumed that any p fraction of blocks 
can be used to extract the file F. This is not true for fast-codes 
(Sec. 3.5.2). But since we assume that the adversary only picks how 
many blocks to delete, and does not resort to strategic deletion of 
blocks, we can trivially extend the above proof to the setting of 
fast-codes by assuming that the adversary deletes blocks randomly. 
Please refer to Sec. 3.5.2 and App. B.2 for a discussion about this 


assumption. 


